Goto

Collaborating Authors

 hyperparameter optimization algorithm


Testing the Efficacy of Hyperparameter Optimization Algorithms in Short-Term Load Forecasting

arXiv.org Artificial Intelligence

Accurate forecasting of electrical demand is essential for maintaining a stable and reliable power grid, optimizing the allocation of energy resources, and promoting efficient energy consumption practices. This study investigates the effectiveness of five hyperparameter optimization (HPO) algorithms -- Random Search, Covariance Matrix Adaptation Evolution Strategy (CMA--ES), Bayesian Optimization, Partial Swarm Optimization (PSO), and Nevergrad Optimizer (NGOpt) across univariate and multivariate Short-Term Load Forecasting (STLF) tasks. Using the Panama Electricity dataset (n=48,049), we evaluate HPO algorithms' performances on a surrogate forecasting algorithm, XGBoost, in terms of accuracy (i.e., MAPE, $R^2$) and runtime. Performance plots visualize these metrics across varying sample sizes from 1,000 to 20,000, and Kruskal--Wallis tests assess the statistical significance of the performance differences. Results reveal significant runtime advantages for HPO algorithms over Random Search. In univariate models, Bayesian optimization exhibited the lowest accuracy among the tested methods. This study provides valuable insights for optimizing XGBoost in the STLF context and identifies areas for future research.


Towards Fair and Rigorous Evaluations: Hyperparameter Optimization for Top-N Recommendation Task with Implicit Feedback

arXiv.org Artificial Intelligence

The widespread use of the internet has led to an overwhelming amount of data, which has resulted in the problem of information overload. Recommender systems have emerged as a solution to this problem by providing personalized recommendations to users based on their preferences and historical data. However, as recommendation models become increasingly complex, finding the best hyperparameter combination for different models has become a challenge. The high-dimensional hyperparameter search space poses numerous challenges for researchers, and failure to disclose hyperparameter settings may impede the reproducibility of research results. In this paper, we investigate the Top-N implicit recommendation problem and focus on optimizing the benchmark recommendation algorithm commonly used in comparative experiments using hyperparameter optimization algorithms. We propose a research methodology that follows the principles of a fair comparison, employing seven types of hyperparameter search algorithms to fine-tune six common recommendation algorithms on three datasets. We have identified the most suitable hyperparameter search algorithms for various recommendation algorithms on different types of datasets as a reference for later study. This study contributes to algorithmic research in recommender systems based on hyperparameter optimization, providing a fair basis for comparison.


Scalable Nested Optimization for Deep Learning

arXiv.org Machine Learning

Gradient-based optimization has been critical to the success of machine learning, updating a single set of parameters to minimize a single loss. A growing number of applications rely on a generalization of this, where we have a bilevel or nested optimization of which subsets of parameters update on different objectives nested inside each other. We focus on motivating examples of hyperparameter optimization and generative adversarial networks. However, naively applying classical methods often fails when we look at solving these nested problems on a large scale. In this thesis, we build tools for nested optimization that scale to deep learning setups.


A greedy constructive algorithm for the optimization of neural network architectures

arXiv.org Machine Learning

In this work we propose a new method to optimize the architecture of an artificial neural network. The algorithm proposed, called Greedy Search for Neural Network Architecture, aims to minimize the complexity of the architecture search and the complexity of the final model selected without compromising the predictive performance. The reduction of the computational cost makes this approach appealing for two reasons. Firstly, there is a need from domain scientists to easily interpret predictions returned by a deep learning model and this tends to be cumbersome when neural networks have complex structures. Secondly, the use of neural networks is challenging in situations with compute/memory limitations. Promising numerical results show that our method is competitive against other hyperparameter optimization algorithms for attainable performance and computational cost. We also generalize the definition of adjusted score from linear regression models to neural networks. Numerical experiments are presented to show that the adjusted score can boost the greedy search to favor smaller architectures over larger ones without compromising the predictive performance.


On Using Hyperopt: Advanced Machine Learning Codementor

#artificialintelligence

In Machine Learning one of the biggest problem faced by the practitioners in the process is choosing the correct set of hyper-parameters. And it takes a lot of time in tuning them accordingly, to stretch the accuracy numbers. For instance lets take, SVC from well known library Scikit-Learn, sklearn.svm.SVC class implements the Support Vector Machine algorithm for classification which contains more than 10 hyperparameters, now adjusting all ten to minimize the loss is very difficult just by using hit and trial. Though Scikit-Learn provides Grid Search and Random Search, but the algorithms are brute force and exhaustive, however hyperopt implements distributed asynchronous algorithm for hyperparameter optimization. Introducing SMBO- Sequential Model Based Global Optimization.


Hyperparameter Optimization and Boosting for Classifying Facial Expressions: How good can a "Null" Model be?

arXiv.org Machine Learning

One of the goals of the ICML workshop on representation and learning is to establish benchmark scores for a new data set of labeled facial expressions. This paper presents the performance of a "Null model" consisting of convolutions with random weights, PCA, pooling, normalization, and a linear readout. Our approach focused on hyperparameter optimization rather than novel model components. On the Facial Expression Recognition Challenge held by the Kaggle website, our hyperparameter optimization approach achieved a score of 60% accuracy on the test data. This paper also introduces a new ensemble construction variant that combines hyperparameter optimization with the construction of ensembles. This algorithm constructed an ensemble of four models that scored 65.5% accuracy. These scores rank 12th and 5th respectively among the 56 challenge participants. It is worth noting that our approach was developed prior to the release of the data set, and applied without modification; our strong competition performance suggests that the TPE hyperparameter optimization algorithm and domain expertise encoded in our Null model can generalize to new image classification data sets.